AMENDMENT TO THE CLAIMS

1. (Currently Amended) A method of compressing a log of linguistic data, the log having a
plurality of linguistic help query strings, each string including at least two tokens, the method 
comprising: 

applying a compression operation to each string; 

determining if any two strings match each other after the
compression operation;
removing one of the two matching strings from the log; and
training a statistical process with the compressed log.

2. (Previously Presented) The method of claim 1, wherein the log is a log of user-initiated 
inputs to a help interface. 

3. (Previously Presented) The method of claim 2, wherein each string is a query relative to a 
help function. 

4. (Previously Presented) The method of claim 3, wherein each help-related query is relative 
to a computer system. 

5. (Original) The method of claim 1, wherein the compression operation is character-based. 

6. (Original) The method of claim 1, wherein the compression operation is token-based. 

7. (Original) The method of claim 1, wherein the compression operation is subsumption. 

8. (Original) The method of claim 7, wherein subsumption includes applying an impossibility 
condition to selectively compute edit distance. 

9. (Original) The method of claim 1, and further comprising:

applying a second compression operation to each string; 

determining if any two strings match each other after the second compression
operation; and
removing one of the two matching strings from the log. 

10. (Original) The method of claim 9, wherein the first compression operation is character-based 
and the second compression operation is token based. 

11. (Original) The method of claim 10, and further comprising applying subsumption after the 
second compression operation is complete. 

12. (Original) The method of claim 11, wherein the subsumption operation is repeated for the 
log. 

13. (Canceled) 

14. (Currently Amended) A system for compressing a query log having a plurality of 
linguistic help query strings, each string having a plurality of tokens, the system comprising: 

an input for receiving a raw query log; 
memory for storing the raw query log; 

a processor for applying at least one compression operation to each string, and for
scanning the modified strings to determine if any match each other so that
one of the matching strings can be removed; and
wherein the processor is configured to utilize the compressed query log to
train a statistical process once the removal is complete.

15. (Previously Presented) The system of claim 14, wherein each string is a query relative to a
help function. 

16. (Previously Presented) The system of claim 15, wherein each help-related query is 
relative to a computer system. 

17. (Original) The system of claim 14, wherein the at least one compression operation is 
character-based. 

18. (Original) The system of claim 14, wherein the at least one compression operation is token- 
based. 

19. (Original) The system of claim 14, wherein the at least one compression operation includes 
subsumption. 

20. (Original) The system of claim 19, wherein subsumption includes applying an impossibility 
condition to selectively compute edit distance. 

21. (Original) The system of claim 14, and further comprising: 

applying at least a second compression operation to each string; 

determining if any two strings match each other after the second compression
operation; and
removing one of the two matching strings from the log. 

22. (Original) The system of claim 21, wherein the first compression operation is character- 
based and the second compression operation is token based. 

23. (Original) The system of claim 22, and further comprising applying subsumption after the
second compression operation is complete. 

24. (Original) The system of claim 23, wherein the subsumption operation is repeated for the 
log. 

25. (Canceled) 
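
Illustrative note (not part of the claims): the sequence recited in claims 1 and 9-12 can be sketched in Python as a character-based pass, then a token-based pass, then subsumption. The claims do not define the individual operations, so every concrete choice below is an assumption made for illustration only: the normalization rules, the stop-word list, the distance threshold, and all function names are hypothetical. In particular, the "impossibility condition" is assumed here to be a length-difference bound (the edit distance between two strings can never be smaller than the difference of their lengths), which lets the costly distance computation be skipped for pairs that cannot match. The training step is not shown.

```python
import re

# Hypothetical stop-word list; the claims do not specify one.
STOP_WORDS = {"a", "an", "the", "to", "do", "i", "how", "in", "my"}

def char_compress(s):
    """Character-based operation (assumed): lowercase, drop punctuation, collapse spaces."""
    s = re.sub(r"[^a-z0-9\s]", "", s.lower())
    return " ".join(s.split())

def token_compress(s):
    """Token-based operation (assumed): drop stop words, sort the remaining tokens."""
    tokens = [t for t in s.split() if t not in STOP_WORDS]
    return " ".join(sorted(tokens))

def dedupe(strings, op):
    """Apply op to each string; when two results match, keep only the first."""
    seen, kept = set(), []
    for s in strings:
        key = op(s)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept

def edit_distance(a, b):
    """Standard Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def subsume(strings, max_dist=1):
    """Subsumption (assumed): drop a string within max_dist edits of one already kept."""
    kept = []
    for s in strings:
        for k in kept:
            # Assumed impossibility condition: distance >= length difference,
            # so pairs whose lengths differ too much are skipped unevaluated.
            if abs(len(s) - len(k)) > max_dist:
                continue
            if edit_distance(s, k) <= max_dist:
                break
        else:
            kept.append(s)
    return kept

def compress_log(log):
    """Character-based pass, token-based pass, then subsumption."""
    log = dedupe(log, char_compress)
    log = dedupe(log, token_compress)
    return subsume(log)

raw_log = [
    "How do I print a file?",
    "how do i print a file",
    "print file",
    "Print files",
    "change my password",
]
compressed = compress_log(raw_log)
```

Under these assumptions the five raw queries reduce to two entries, which would then serve as the compressed log passed to the training step.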



